Skip to contents

Using a basic example, we walk through the main steps of the methodology to show how it is used.

Preparing the data

To start, install the GeoPressureR package from Github using the following line:

devtools::install_github("Rafnuss/GeoPressureR")

We will be using the following libraries:

Reading geolocator data

In this example, we use data captured on a Great Reed Warbler Acrocephalus arundinaceus (18LX). Below, we read the geolocator data and crop it so that it starts on the equipment date and ends on the retrieval date.

pam_data = pam_read(pathname = system.file("extdata", "/", package = "GeoPressureR"),
                    crop_start = "2017-06-20", crop_end = "2018-05-02")

Automatic classification of activity

We use a k-mean clustering to group periods of low and high activity. We then classify high activities lasting more than 30 minutes as migratory activities. See more possible classifications described in the PALMr manual.

pam_data = pam_classify(pam_data, min_duration = 30)

Editing activity on TRAINSET

To ensure the high level of precision needed for the pressure match, we must manually edit the activity classification and the pressure timeseries to be matched. We suggest doing this with TRAINSET. A separate vignette dedicated to this exercise, including best practices and a sample code to get started, is available here.

Use trainset_write() to export the automatically generated classifications in a csv file, which can be opened in TRAINSET: https://trainset.geocene.com/.

trainset_write(pam_data, pathname=system.file("extdata", "/", package = "GeoPressureR"))
# browseURL("https://trainset.geocene.com/")

Printscreen of the manual classification in TRAINSET. See the labeling track vignette for more information.

When you have finished the manual editing, export the new csv file (TRAINSET will add -labeled in the name). Make sure to keep this classification file (e.g. under /data/).

To edit an existing file, re-open the file on TRAINSET and read this file directly with trainset_read().

pam_data = trainset_read(pam_data, pathname=system.file("extdata", "/", package = "GeoPressureR"))

Identifying stationary periods

Based on the activity labeling, pam_sta() creates a table of stationary periods as illustrated below.

pam_data = pam_sta(pam_data)
knitr::kable(head(pam_data$sta))
start end duration next_flight_duration sta_id
2017-06-20 00:00:00 2017-08-04 19:50:00 1099.83333 hours 205 mins 1
2017-08-04 23:15:00 2017-08-05 19:30:00 20.25000 hours 440 mins 2
2017-08-06 02:50:00 2017-08-06 19:15:00 16.41667 hours 475 mins 3
2017-08-07 03:10:00 2017-08-07 19:15:00 16.08333 hours 295 mins 4
2017-08-08 00:10:00 2017-08-29 18:40:00 522.50000 hours 590 mins 5
2017-08-30 04:30:00 2017-08-30 18:45:00 14.25000 hours 565 mins 6

We can visualize the pressure measurements for each grouped stationary period (symbolized by a different color).

p <- subset(pam_data$pressure, sta_id != 0) %>% 
  ggplot() +
  geom_point(data=pam_data$pressure, aes(x=date,y=obs), colour="black",size=0.25) +
  geom_line(aes(x=date,y=obs,col=as.factor(sta_id))) + 
  theme_bw() +
  scale_y_continuous(name="Pressure(hPa)") +
  scale_colour_manual(values=rep(RColorBrewer::brewer.pal(9,"Set1"),times=4))
  #scale_colour_brewer(type="qualitative", palette = 'Set1')

ggplotly(p, dynamicTicks = T)

Computing the map of pressure

Now that we have clean pressure timeseries for each stationary period, we are ready to match each one with atmospheric pressure data (ERA5). To overcome the challenge of computing mismatch on such a large dataset, this R package uses the API GeoPressure to perform the computation on Google Earth Engine. Read more about the API here.

Initially, it is easier and faster to query only long stationary periods (in the example below, we select only periods longer than 12hrs). You can do so by setting the pressure of the stationary periods you wish to discard to NA.

sta_id_keep = pam_data$sta$sta_id[difftime(pam_data$sta$end,pam_data$sta$start, units = "hours")<12]
pam_data$pressure$sta_id[pam_data$pressure$sta_id %in% sta_id_keep] = NA

We can now query the data on the API with geopressure_map(). A detailed description of the parameters can be found here.

extent = c(-16,20,0,50) # coordinates of the map to request (W,E,S,N) 
scale = 10 # request on a 0.1° grid to make the code faster
max_sample = 250 # limit the query to the first 250 datapoints. 
margin = 30 # roughly equivalent to 3hPa
raster_list = geopressure_map(pam_data$pressure, extent = extent, scale = scale, max_sample = max_sample, margin = margin)

geopressure_map() returns a list of two rasters for each stationary periods. The first is the mean square error (\(\textbf{MSE}\)) between the pressure timeseries and ERA5 map. The second (\(\textbf{z}_{thr}\)) is the proportion of datapoints in the pressure timeseries which correspond to an altitude that falls between the min and max altitude of each grid cell. Read more about these values and how they are computed here.

We then combine the two rasters in a single probability map using \[\textbf{P} = \exp \left(-w \frac{\textbf{MSE}}{s} \right) [\textbf{z}_{thr}>thr]\] where \(s\) is the standard deviation of pressure and \(thr\) is the threshold mask. Because the auto-correlation of the timeseries is not accounted for in this equation, we use a log-linear pooling weight \(w=\log(n) - 1\), where \(n\) is the number of datapoints in the timeseries. This operation is described in publication […]. Another vignette describing the influence of log-linear pooling and length of timeseries will be added later.

s = 0.4 # standard deviation of pressure
thr = 0.9 # threashold of the threahold proportion value acceptable
prob_map_list = geopressure_prob_map(raster_list, s=s, thr=thr)

We use leaflet() to visualize the threshold mask, mismatch map, and overall probability map for a single stationary period.

i_r = 1;
leaflet() %>% addTiles() %>%
  addRasterImage(prob_map_list[[i_r]], opacity = 0.8, group="Probability") %>%
  addRasterImage(raster_list[[i_r]][[1]], opacity = 0.8, group="Mismatch") %>%
  addRasterImage(raster_list[[i_r]][[2]], opacity = 0.8, group="Threashold") %>%
  # addLegend(pal = pal, values = values(v[[i_s]][[3]]), title = "Probability") %>% 
  addLayersControl(
    overlayGroups = c("Probability","Mismatch","Threashold"),
    options = layersControlOptions(collapsed = FALSE)
  ) %>% hideGroup(c("Mismatch","Threashold"))

We can also visualize the probability map for all stationary periods:

li_s=list()
l = leaflet() %>% addTiles() 
for (i_r in 1:length(prob_map_list)){
  i_s = metadata(prob_map_list[[i_r]])$sta_id
  info = pam_data$sta[pam_data$sta$sta_id==i_s,]
  info_str = paste0(i_s," | ",info$start,"->",info$end)
  li_s <- append(li_s, info_str)
  l = l %>% addRasterImage(prob_map_list[[i_r]], opacity = 0.8, group=info_str) 
}
l %>% 
  addLayersControl(
    overlayGroups = li_s,
    options = layersControlOptions(collapsed = FALSE)
  ) %>% hideGroup(tail(li_s,length(li_s)-1))

Computing altitude

The second operation you can perform with GeoPressureR is to compute the exact altitude of the bird \(z_{gl}\) from its pressure measurement \(P_{gl}\) and assuming its location \(x\). This function uses ERA5 to adjust the barometric equation, \[ z_{gl}(x)=z_{ERA5}(x) + \frac{T_{ERA5}(x)}{L_b} \left( \frac{P_{gl}}{P_{ERA5}(x)} \right) ^{\frac{RL_b}{g M}-1},\] where \(z_{ERA}\), \(T_{ERA}\) and \(T_{ERA}\) respectively correspond to the ground level elevation, temperature at 2m and ground level pressure of ERA5, \(L_b\) is the standard temperature lapse rate, \(R\) is the universal gas constant, \(g\) is the gravity constant and \(M\) is the molar mass of air. See more information here.

We can compute the bird’s elevation for its first stationary period using the most likely position on the probability map.

i_r = 1
i_s = metadata(prob_map_list[[i_r]])$sta_id
tmp = as.data.frame(prob_map_list[[i_r]],xy=T)
lon = tmp$x[which.max(tmp[,3])]
lat = tmp$y[which.max(tmp[,3])]

We extract the pressure for this stationary period:

id = pam_data$pressure$sta_id==i_s & !is.na(pam_data$pressure$sta_id)
pressure = list(
  obs = pam_data$pressure$obs[id],
  date = pam_data$pressure$date[id],
  class = pam_data$pressure$class[id]
)

And then call the function geopressure_ts:

ts_list[[i_r]]  = geopressure_ts(lon, lat, pressure = pressure)

We can compare the altitude produced to the one computed without the correction for temperature and pressure:

Lb = -0.0065
R = 8.31432
g0 = 9.80665
M = 0.0289644
T0 = 273.15+15
P0 = 1013.25
ts_list[[i_r]]$altitude_baro = T0/Lb * ((ts_list[[i_r]]$pressure/P0)^(-R*Lb/g0/M) - 1 )

and visualize this comparison:

p <- ggplot() +
  geom_line(data=as.data.frame(ts_list[[i_r]]), aes(x=date,y=altitude, col=as.factor("Corrected elevation with ERA5"))) + 
  geom_line(data=as.data.frame(ts_list[[i_r]]), aes(x=date,y=altitude_baro, col=as.factor("Uncorrected elevation"))) + 
  labs(col="") +
  theme_bw()

ggplotly(p)

The function geopressure_ts() also returns the ground level pressure timeseries from ERA5 at the location specified. This is useful to check whether there is a good match between the pressure measured by the geolocator and the one at the assumed location. This operation is typically used to check the quality of the manual labeling (see the labeling track vignette).